Search CORE

41 research outputs found

Recommended from our members

Robust Machine Learning by Integrating Context

Author: Mao Chengzhi
Publication venue
Publication date: 01/01/2023
Field of study

Intelligent software has the potential to transform our society. It is becoming the building block for many systems in the real world. However, despite the excellent performance of machine learning models on benchmarks, state-of-the-art methods like neural networks often fail once they encounter realistic settings. Since neural networks often learn correlations without reasoning with the right signals and knowledge, they fail when facing shifting distributions, unforeseen corruptions, and worst-case scenarios. Since neural networks are black-box models, they are not interpretable or trusted by the user. We need to build robust models for machine learning to be confidently and responsibly deployed in the most critical applications and systems. In this dissertation, I introduce our robust machine learning systems advancements by tightly integrating context into algorithms. The context has two aspects: the intrinsic structure of natural data, and the extrinsic structure from domain knowledge. Both are crucial: By capitalizing on the intrinsic structure in natural data, my work has shown that we can create robust machine learning systems, even in the worst case, an analytical result that also enjoys strong empirical gains. Through integrating external knowledge, such as the association between tasks and causal structure, my framework can instruct models to use the right signals for inference, enabling new opportunities for controllable and interpretable models. This thesis consists of three parts. In the first part, I aim to cover three works that use the intrinsic structure as a constraint to achieve robust inference. I present our framework that performs test-time optimization to respect the natural constraint, which is captured by self-supervised tasks. I illustrate that test-time optimization improves out-of-distribution generalization and adversarial robustness. Besides the inference algorithm, I show that intrinsic structure through discrete representations also improves out-of-distribution robustness. In the second part of the thesis, I then detail my work using external domain knowledge. I first introduce using causal structure from external domain knowledge to improve domain generalization robustness. I then show how the association of multiple tasks and regularization objectives helps robustness. In the final part of this dissertation, I show three works on trustworthy and reliable foundation models, a general-purpose model that will be the foundation for many AI applications. I show a framework that uses context to secure, interpret, and control foundation models

Columbia University Academic Commons

Convolutional Visual Prompt for Robust Visual Perception

Author: Mao Chengzhi
Tsai Yun-Yun
Yang Junfeng
Publication venue
Publication date: 25/10/2023
Field of study

Vision models are often vulnerable to out-of-distribution (OOD) samples without adapting. While visual prompts offer a lightweight method of input-space adaptation for large-scale vision models, they rely on a high-dimensional additive vector and labeled data. This leads to overfitting when adapting models in a self-supervised test-time setting without labels. We introduce convolutional visual prompts (CVP) for label-free test-time adaptation for robust visual perception. The structured nature of CVP demands fewer trainable parameters, less than 1\% compared to standard visual prompts, combating overfitting. Extensive experiments and analysis on a wide variety of OOD visual perception tasks show that our approach is effective, improving robustness by up to 5.87% over several large-scale models

arXiv.org e-Print Archive

Robustifying Language Models with Test-Time Adaptation

Author: Mao Chengzhi
McDermott Noah Thomas
Yang Junfeng
Publication venue
Publication date: 29/10/2023
Field of study

Large-scale language models achieved state-of-the-art performance over a number of language tasks. However, they fail on adversarial language examples, which are sentences optimized to fool the language models but with similar semantic meanings for humans. While prior work focuses on making the language model robust at training time, retraining for robustness is often unrealistic for large-scale foundation models. Instead, we propose to make the language models robust at test time. By dynamically adapting the input sentence with predictions from masked words, we show that we can reverse many language adversarial attacks. Since our approach does not require any training, it works for novel tasks at test time and can adapt to novel adversarial corruptions. Visualizations and empirical results on two popular sentence classification datasets demonstrate that our method can repair adversarial language attacks over 65% oComment: 8 Pages 2 Figures Submitted to ICLR Worksho

arXiv.org e-Print Archive

1st ICLR International Workshop on Privacy, Accountability, Interpretability, Robustness, Reasoning on Structured Data (PAIR^2Struct)

Author: He Hao
Lin Wanyu
Mao Chengzhi
Wang Di
Wang Hao
Zhang Muhan
Publication venue
Publication date: 07/10/2022
Field of study

Recent years have seen advances on principles and guidance relating to accountable and ethical use of artificial intelligence (AI) spring up around the globe. Specifically, Data Privacy, Accountability, Interpretability, Robustness, and Reasoning have been broadly recognized as fundamental principles of using machine learning (ML) technologies on decision-critical and/or privacy-sensitive applications. On the other hand, in tremendous real-world applications, data itself can be well represented as various structured formalisms, such as graph-structured data (e.g., networks), grid-structured data (e.g., images), sequential data (e.g., text), etc. By exploiting the inherently structured knowledge, one can design plausible approaches to identify and use more relevant variables to make reliable decisions, thereby facilitating real-world deployments

arXiv.org e-Print Archive

Robust Perception through Equivariance

Author: Joshi Abhishek
Mao Chengzhi
Vondrick Carl
Wang Hao
Yang Junfeng
Zhang Lingyu
Publication venue
Publication date: 12/12/2022
Field of study

Deep networks for computer vision are not reliable when they encounter adversarial examples. In this paper, we introduce a framework that uses the dense intrinsic constraints in natural images to robustify inference. By introducing constraints at inference time, we can shift the burden of robustness from training to the inference algorithm, thereby allowing the model to adjust dynamically to each individual image's unique and potentially novel characteristics at inference time. Among different constraints, we find that equivariance-based constraints are most effective, because they allow dense constraints in the feature space without overly constraining the representation at a fine-grained level. Our theoretical results validate the importance of having such dense constraints at inference time. Our empirical experiments show that restoring feature equivariance at inference time defends against worst-case adversarial perturbations. The method obtains improved adversarial robustness on four datasets (ImageNet, Cityscapes, PASCAL VOC, and MS-COCO) on image recognition, semantic segmentation, and instance segmentation tasks. Project page is available at equi4robust.cs.columbia.edu

arXiv.org e-Print Archive

Bidirectional Inference Networks: A Class of Deep Bayesian Networks for Health Profiling

Author: He Hao
Jaakkola Tommi S.
Katabi Dina
Mao Chengzhi
Wang Hao
Zhao Mingmin
Publication venue
Publication date: 06/02/2019
Field of study

We consider the problem of inferring the values of an arbitrary set of variables (e.g., risk of diseases) given other observed variables (e.g., symptoms and diagnosed diseases) and high-dimensional signals (e.g., MRI images or EEG). This is a common problem in healthcare since variables of interest often differ for different patients. Existing methods including Bayesian networks and structured prediction either do not incorporate high-dimensional signals or fail to model conditional dependencies among variables. To address these issues, we propose bidirectional inference networks (BIN), which stich together multiple probabilistic neural networks, each modeling a conditional dependency. Predictions are then made via iteratively updating variables using backpropagation (BP) to maximize corresponding posterior probability. Furthermore, we extend BIN to composite BIN (CBIN), which involves the iterative prediction process in the training stage and improves both accuracy and computational efficiency by adaptively smoothing the optimization landscape. Experiments on synthetic and real-world datasets (a sleep study and a dermatology dataset) show that CBIN is a single model that can achieve state-of-the-art performance and obtain better accuracy in most inference tasks than multiple models each specifically trained for a different task.Comment: Appeared at AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Doubly Right Object Recognition: A Why Prompt for Visual Rationales

Author: Mao Chengzhi
Menon Sachit
Sundar Amrutha
Teotia Revant
Vondrick Carl
Wang Xin
Yang Junfeng
Publication venue
Publication date: 12/12/2022
Field of study

Many visual recognition models are evaluated only on their classification accuracy, a metric for which they obtain strong performance. In this paper, we investigate whether computer vision models can also provide correct rationales for their predictions. We propose a ``doubly right'' object recognition benchmark, where the metric requires the model to simultaneously produce both the right labels as well as the right rationales. We find that state-of-the-art visual models, such as CLIP, often provide incorrect rationales for their categorical predictions. However, by transferring the rationales from language models into visual representations through a tailored dataset, we show that we can learn a ``why prompt,'' which adapts large visual representations to produce correct rationales. Visualizations and empirical experiments show that our prompts significantly improve performance on doubly right object recognition, in addition to zero-shot transfer to unseen tasks and datasets

arXiv.org e-Print Archive